On the Automatic Enrichment of a Portuguese Wordnet with Dictionary Definitions
Abstract
Besides synsets and semantic relations, synset glosses are an important feature of wordnets. However, because of the effort required, their creation is sometimes left undone. This is the case for Onto.PT, an automatically created Portuguese wordnet that has no glosses. In our work, we exploited Portuguese dictionaries to automatically assign definitions to the synsets of Onto.PT. For this purpose, definitions are selected according to their overlap with the context of the synsets. Using three Portuguese dictionaries, more than one third of the Onto.PT synsets were assigned at least one definition, with an assignment accuracy close to 80%, which we believe to be an interesting result.
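The abstract says that definitions are selected by their overlap with the synset context, but gives no further detail. The sketch below is only an illustration of that general idea, not the authors' method: it assumes that a synset context is a plain set of words (for example, the synset's own words plus words from related synsets) and that overlap is a simple count of shared tokens. The names `tokenize`, `best_definition`, and `min_overlap`, as well as the toy data, are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {tok.strip(".,;:()[]\"'").lower() for tok in text.split()} - {""}


def best_definition(
    synset_context: Set[str],
    candidate_definitions: List[str],
    min_overlap: int = 1,
) -> Optional[Tuple[str, int]]:
    """Return the candidate sharing the most tokens with the context.

    Assumption: overlap is a raw count of shared tokens. Returns None
    when no candidate reaches min_overlap (a hypothetical threshold).
    """
    best: Optional[Tuple[str, int]] = None
    for definition in candidate_definitions:
        overlap = len(tokenize(definition) & synset_context)
        if overlap >= min_overlap and (best is None or overlap > best[1]):
            best = (definition, overlap)
    return best


# Toy data, invented for illustration only.
context = {"carro", "automóvel", "veículo", "motor", "rodas"}
candidates = [
    "conjunto de cartas de jogar",
    "veículo com motor e quatro rodas",
]
result = best_definition(context, candidates)
```

In this toy case the second candidate shares three tokens with the context and is selected, while a candidate list with no shared tokens yields `None`. A real system would likely need stopword filtering, lemmatization, and a tuned threshold, none of which are described in the abstract.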
Similar resources
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Enriching a Portuguese WordNet using Synonyms from a Monolingual Dictionary
In this article we present an exploratory approach to enrich a WordNet-like lexical ontology with the synonyms present in a standard monolingual Portuguese dictionary. The dictionary was converted from PDF into XML and senses were automatically identified and annotated. This allowed us to extract them, independently of definitions, and to create sets of synonyms (synsets). These synsets were th...
Monolingual and bilingual dictionary approaches to the enrichment of the Spanish WordNet with adjectives
We report on two different approaches to the incorporation of adjectives in Spanish WordNet based on automatic extraction techniques using EuroWordNet and machine-readable dictionaries. We show that a monolingual dictionary approach makes it possible to exploit relations between different parts of speech and enrich the internal structure of the Spanish WordNet, while the methods based on bilingual dictio...
Low-Cost Enrichment of Spanish WordNet with Automatically Translated Glosses: Combining General and Specialized Models
This paper studies the enrichment of Spanish WordNet with synset glosses automatically obtained from the English WordNet glosses using a phrase-based Statistical Machine Translation system. We construct the English-Spanish translation system from a parallel corpus of proceedings of the European Parliament, and study how to adapt statistical models to the domain of dictionary definitions. We bui...
Using WordNet for linking UWs to the UNL UW System
This paper presents the work done with the Spanish-UNL dictionary compiled at the Spanish Language Centre in order to enrich the universal words it contained with the supplementary semantic information required to produce a master entries dictionary. Focusing on a subset of the Spanish-UNL dictionary, namely the substantives it contains, the work has consisted in automatically enriching the uni...